interpretability method
FIND: A Function Description Benchmark for Evaluating Interpretability Methods
Labeling neural network submodules with human-legible descriptions is useful for many downstream tasks: such descriptions can surface failures, guide interventions, and perhaps even explain important model behaviors. To date, most mechanistic descriptions of trained networks have involved small models, narrowly delimited phenomena, and large amounts of human labor. Labeling all human-interpretable sub-computations in models of increasing size and complexity will almost certainly require tools that can generate and validate descriptions automatically. Recently, techniques that use learned models in the loop for labeling have begun to gain traction, but methods for evaluating their efficacy are limited and ad hoc. How should we validate and compare open-ended labeling tools?
Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance
Interpretability methods are valuable only if their explanations faithfully describe the explained model. In this work, we consider neural networks whose predictions are invariant under a specific symmetry group. This includes popular architectures, ranging from convolutional to graph neural networks. Any explanation that faithfully explains this type of model needs to be in agreement with this invariance property.
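One way to make the agreement requirement concrete, as a minimal sketch rather than the paper's own metric or code: take a toy model that is invariant to cyclic shifts (a circular convolution followed by sum pooling) and check whether an explanation of a shifted input matches the shifted explanation of the original input. The function names, the toy model, the finite-difference gradient, and the cosine-similarity score are all assumptions made for illustration.

```python
import numpy as np

def circular_conv(x, kernel):
    # y[i] = sum_j kernel[j] * x[(i + j) % n]; shifting x shifts y by the same amount.
    return sum(k * np.roll(x, -j) for j, k in enumerate(kernel))

def model(x, kernel):
    # Sum pooling over a shift-equivariant layer gives a shift-invariant output.
    return float(np.tanh(circular_conv(x, kernel)).sum())

def gradient_saliency(x, kernel, eps=1e-5):
    # Central finite differences stand in for autograd to avoid extra dependencies.
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (model(x + step, kernel) - model(x - step, kernel)) / (2 * eps)
    return grad

def position_weighted_saliency(x, kernel):
    # Deliberately flawed explanation (hypothetical): a fixed positional ramp
    # that does not move with the input, so it ignores the symmetry.
    return gradient_saliency(x, kernel) * np.arange(1, x.size + 1)

def equivariance_score(explain, x, kernel):
    # Mean cosine similarity between explain(shifted x) and shifted explain(x),
    # taken over every cyclic shift of the input.
    base = explain(x, kernel)
    sims = []
    for shift in range(x.size):
        a = explain(np.roll(x, shift), kernel)
        b = np.roll(base, shift)
        sims.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return float(np.mean(sims))

rng = np.random.default_rng(0)
x = rng.normal(size=8)
kernel = np.array([0.5, -1.0, 0.25])

grad_score = equivariance_score(gradient_saliency, x, kernel)
biased_score = equivariance_score(position_weighted_saliency, x, kernel)
print(f"gradient saliency: {grad_score:.3f}")
print(f"position-weighted: {biased_score:.3f}")
```

Under these assumptions, plain gradient saliency should score close to 1, because the gradient of a shift-invariant function moves with the input, while the position-weighted variant should score lower. A score of this kind only checks consistency with the model's symmetry; it says nothing else about whether the explanation is faithful.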